Search CORE

9 research outputs found

Finding Subcube Heavy Hitters in Analytics Data Streams

Author: Kveton Branislav
Muthukrishnan S.
Vu Hoa T.
Xian Yikun
Publication venue
Publication date: 01/01/2018
Field of study

Data streams typically have items of large number of dimensions. We study the fundamental heavy-hitters problem in this setting. Formally, the data stream consists of

d

-dimensional items

x_1,\ldots,x_m \in [n]^d

. A

k

-dimensional subcube

T

is a subset of distinct coordinates

\{ T_1,\cdots,T_k \} \subseteq [d]

. A subcube heavy hitter query

{\rm Query}(T,v)

v \in [n]^k

, outputs YES if

f_T(v) \geq \gamma

and NO if

f_T(v) < \gamma/4

, where

f_T

is the ratio of number of stream items whose coordinates

T

have joint values

v

. The all subcube heavy hitters query

{\rm AllQuery}(T)

outputs all joint values

v

that return YES to

{\rm Query}(T,v)

. The one dimensional version of this problem where

d=1

was heavily studied in data stream theory, databases, networking and signal processing. The subcube heavy hitters problem is applicable in all these cases. We present a simple reservoir sampling based one-pass streaming algorithm to solve the subcube heavy hitters problem in

\tilde{O}(kd/\gamma)

space. This is optimal up to poly-logarithmic factors given the established lower bound. In the worst case, this is

\Theta(d^2/\gamma)

which is prohibitive for large

d

, and our goal is to circumvent this quadratic bottleneck. Our main contribution is a model-based approach to the subcube heavy hitters problem. In particular, we assume that the dimensions are related to each other via the Naive Bayes model, with or without a latent dimension. Under this assumption, we present a new two-pass,

\tilde{O}(d/\gamma)

-space algorithm for our problem, and a fast algorithm for answering

{\rm AllQuery}(T)

O(k/\gamma^2)

time. Our work develops the direction of model-based data stream analysis, with much that remains to be explored.Comment: To appear in WWW 201

arXiv.org e-Print Archive

Crossref

Reinforcement Knowledge Graph Reasoning for Explainable Recommendation

Author: de Melo Gerard
Fu Zuohui
Muthukrishnan S.
Xian Yikun
Zhang Yongfeng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/06/2019
Field of study

Recent advances in personalized recommendation have sparked great interest in the exploitation of rich structured information provided by knowledge graphs. Unlike most existing approaches that only focus on leveraging knowledge graphs for more accurate recommendation, we perform explicit reasoning with knowledge for decision making so that the recommendations are generated and supported by an interpretable causal inference procedure. To this end, we propose a method called Policy-Guided Path Reasoning (PGPR), which couples recommendation and interpretability by providing actual paths in a knowledge graph. Our contributions include four aspects. We first highlight the significance of incorporating knowledge graphs into recommendation to formally define and interpret the reasoning process. Second, we propose a reinforcement learning (RL) approach featuring an innovative soft reward strategy, user-conditional action pruning and a multi-hop scoring function. Third, we design a policy-guided graph search algorithm to efficiently and effectively sample reasoning paths for recommendation. Finally, we extensively evaluate our method on several large-scale real-world benchmark datasets, obtaining favorable results compared with state-of-the-art methods.Comment: Accepted in SIGIR 201

arXiv.org e-Print Archive

Crossref

ABSent: Cross-Lingual Sentence Representation Mapping with Bidirectional GANs

Author: de Melo Gerard
Dong Xin
Fu Zuohui
Ge Yingqiang
Geng Shijie
Wang Guang
Wang Yuting
Xian Yikun
Publication venue
Publication date: 29/01/2020
Field of study

A number of cross-lingual transfer learning approaches based on neural networks have been proposed for the case when large amounts of parallel text are at our disposal. However, in many real-world settings, the size of parallel annotated training data is restricted. Additionally, prior cross-lingual mapping research has mainly focused on the word level. This raises the question of whether such techniques can also be applied to effortlessly obtain cross-lingually aligned sentence representations. To this end, we propose an Adversarial Bi-directional Sentence Embedding Mapping (ABSent) framework, which learns mappings of cross-lingual sentence representations from limited quantities of parallel data

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

CAFE: Coarse-to-Fine Neural Symbolic Reasoning for Explainable Recommendation

Author: Chen Xu
de Melo Gerard
Fu Zuohui
Ge Yingqiang
Geng Shijie
Huang Qiaoying
Muthukrishnan S.
Qin Zhou
Xian Yikun
Zhang Yongfeng
Zhao Handong
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/10/2020
Field of study

Recent research explores incorporating knowledge graphs (KG) into e-commerce recommender systems, not only to achieve better recommendation performance, but more importantly to generate explanations of why particular decisions are made. This can be achieved by explicit KG reasoning, where a model starts from a user node, sequentially determines the next step, and walks towards an item node of potential interest to the user. However, this is challenging due to the huge search space, unknown destination, and sparse signals over the KG, so informative and effective guidance is needed to achieve a satisfactory recommendation quality. To this end, we propose a CoArse-to-FinE neural symbolic reasoning approach (CAFE). It first generates user profiles as coarse sketches of user behaviors, which subsequently guide a path-finding process to derive reasoning paths for recommendations as fine-grained predictions. User profiles can capture prominent user behaviors from the history, and provide valuable signals about which kinds of path patterns are more likely to lead to potential items of interest for the user. To better exploit the user profiles, an improved path-finding algorithm called Profile-guided Path Reasoning (PPR) is also developed, which leverages an inventory of neural symbolic reasoning modules to effectively and efficiently find a batch of paths over a large-scale KG. We extensively experiment on four real-world benchmarks and observe substantial gains in the recommendation performance compared with state-of-the-art methods.Comment: Accepted in CIKM 202

arXiv.org e-Print Archive

Crossref

Semantic Search with Information Integration

Author: Xian Yikun
Zhang Liu
Publication venue: Linnéuniversitetet, Institutionen för datavetenskap, fysik och matematik, DFM
Publication date: 01/01/2011
Field of study

Since the search engine was first released in 1993, the development has never been slow down and various search engines emerged to vied for popularity. However, current traditional search engines like Google and Yahoo! are based on key words which lead to results impreciseness and information redundancy. A new search engine with semantic analysis can be the alternate solution in the future. It is more intelligent and informative, and provides better interaction with users. This thesis discusses the detail on semantic search, explains advantages of semantic search over other key-word-based search and introduces how to integrate semantic analysis with common search engines. At the end of this thesis, there is an example of implementation of a simple semantic search engine

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Linnéuniversitetets forskningsdatabas